74 research outputs found

    Optimal Stroke Learning with Policy Gradient Approach for Robotic Table Tennis

    Learning to play table tennis is a challenging task for robots, as a wide variety of strokes is required. Recent advances have shown that deep Reinforcement Learning (RL) is able to successfully learn the optimal actions in a simulated environment. However, the applicability of RL in real scenarios remains limited due to the high exploration effort. In this work, we propose a realistic simulation environment in which multiple models are built for the dynamics of the ball and the kinematics of the robot. Instead of training an end-to-end RL model, a novel policy gradient approach with a TD3 backbone is proposed to learn the racket strokes based on the predicted state of the ball at the hitting time. In the experiments, we show that the proposed approach significantly outperforms the existing RL methods in simulation. Furthermore, to cross the domain from simulation to reality, we adopt an efficient retraining method and test it in three real scenarios. The resulting success rate is 98% and the distance error is around 24.9 cm. The total training time is about 1.5 hours.

    Sample-efficient Reinforcement Learning in Robotic Table Tennis

    Reinforcement learning (RL) has achieved some impressive recent successes in various computer games and simulations. Most of these successes are based on having large numbers of episodes from which the agent can learn. In typical robotic applications, however, the number of feasible attempts is very limited. In this paper we present a sample-efficient RL algorithm applied to the example of a table tennis robot. In table tennis every stroke is different, with varying placement, speed and spin. An accurate return therefore has to be found depending on a high-dimensional continuous state space. To make learning in few trials possible, the method is embedded into our robot system. In this way we can use a one-step environment. The state space depends on the ball at hitting time (position, velocity, spin) and the action is the racket state (orientation, velocity) at hitting. An actor-critic based deterministic policy gradient algorithm was developed for accelerated learning. Our approach performs competitively both in a simulation and on the real robot in a number of challenging scenarios. Accurate results are obtained without pre-training in under 200 episodes of training. The video presenting our experiments is available at https://youtu.be/uRAtdoL6Wpw. Comment: accepted at ICRA 2021 (Xian, China)

    A Deep-learning Real-time Bias Correction Method for Significant Wave Height Forecasts in the Western North Pacific

    Significant wave height (SWH) is one of the most important parameters characterizing ocean waves, and accurate numerical ocean wave forecasting is crucial for coastal protection and shipping. However, due to the randomness and nonlinearity of the wind fields that generate ocean waves and the complex interaction between wave and wind fields, current numerical ocean wave forecasts have biases. In this study, a spatiotemporal deep-learning method was employed to correct gridded SWH forecasts from the ECMWF-IFS. This method was built on the trajectory gated recurrent unit deep neural network, and it conducts real-time rolling correction for the 0-240 h SWH forecasts from ECMWF-IFS. The correction model is co-driven by wave and wind fields, providing better results than those based on wave fields alone. A novel pixel-switch loss function was developed. The pixel-switch loss function can dynamically fine-tune the pre-trained correction model, focusing on pixels with large biases in SWH forecasts. According to the seasonal characteristics of SWH, four correction models were constructed separately, for spring, summer, autumn, and winter. The experimental results show that, compared with the original ECMWF SWH predictions, the correction was most effective in spring, when the mean absolute error decreased by 12.972% to 46.237%. Although winter had the worst performance, the mean absolute error still decreased by 13.794% to 38.953%. The corrected results improved the original ECMWF SWH forecasts under both normal and extreme weather conditions, indicating that our SWH correction model is robust and generalizable. Comment: 21 page

    Effect of arabinogalactan protein complex content on emulsification performance of gum arabic

    The emulsification properties of standard (STD), matured (EM2 and EM10) and fractionated gum arabic (GA) samples, the latter obtained via phase separation-induced molecular fractionation, were investigated to find out how the content of arabinogalactan protein (AGP) complex affects the resulting emulsion properties. Phase separation and the accompanying molecular fractionation were induced by mixing with different hydrocolloids including hyaluronan (HA), carboxymethyl cellulose (CMC), and maltodextrin (MD). An increase in AGP content from 11 to 28% resulted in the formation of emulsions with relatively smaller droplet sizes and better stability. A further increase in the AGP content to 41% resulted in the formation of emulsions with larger droplets. In spite of the larger droplet sizes, these emulsions were extremely stable. In addition, the emulsions prepared with GA of higher AGP content showed better stability in the presence of ethanol. The results indicate that AGP content plays a vital role in emulsion stability and droplet size.

    Interfacial and emulsifying properties of the electrostatic complex of β-lactoglobulin fibril and gum Arabic (Acacia Seyal)

    The formation, interfacial and emulsifying properties of the electrostatic complex of β-lactoglobulin fibril (BLGF) and gum Arabic Acacia Seyal (AS) were investigated. A necklace-like soluble complex could be formed at pH 3.5, and its charge and interfacial properties depended on the BLGF content. With an appropriate amount of BLGF (< 9.09 wt.%), the formed complex possessed good dispersibility and surface activity. When excessive BLGF (9.09∼50 wt.%) was present, the surface charge of the complex was gradually neutralized and aggregation occurred. Homogeneous oil-in-water emulsions could be stabilized by the complex, and the droplet size decreased with increasing BLGF content. A higher content of BLGF (9.09∼50 wt.%) was detrimental to emulsification due to the aggregation of the complex, and the formed emulsion tended to flocculate. Compared with AS, the emulsions formed with the complex were much more stable against heating (90 ℃, 30 min) and salting (200 mM NaCl) environments, and the emulsions were stable during long-term storage (46 days). Mechanisms are proposed for the adsorption of the BLGF/AS complex at the oil-water interface. Pure AS (i) could adsorb at the oil-water interface but formed a loose film due to its poor surface activity and insufficient adsorption amount. With the addition of a small amount of fibrils (ii), soluble electrostatic complexes are formed, and they can be adsorbed at the interface to form a dense viscoelastic film due to the surface activity of the BLGF. With a higher content of fibrils (iii), the surface charge of the complex tended to be neutralized, causing aggregation. Because of the presence of protein fibrils, the aggregates could also adsorb at the oil-water interface to produce a viscoelastic film. However, with a bigger size and irregular shape, the aggregates were difficult to array at the interface as densely as the soluble complex.

    Effects of temperature and solvent condition on phase separation induced molecular fractionation of gum arabic/hyaluronan aqueous mixtures

    Effects of temperature and solvent condition on phase separation-induced molecular fractionation of gum arabic/hyaluronan (GA/HA) mixed solutions were investigated. Two gum arabic samples (EM10 and STD) with different molecular weights and polydispersity indices were used. Phase diagrams, including cloud and binodal curves, were established by visual observation and GPC-RI methods. The molecular parameters of control and fractionated GA, from upper and bottom phases, were measured by GPC-MALLS. Fractionation of GA increased the content of arabinogalactan-protein complex (AGP) from ca. 11% to 18% in the STD/HA system and from 28% to 55% in the EM10/HA system. The phase separation-induced molecular fractionation was further studied as a function of temperature and solvent condition (varying ionic strength and ethanol content). Increasing salt concentration (from 0.5 to 5 mol/L) greatly reduced the extent of phase separation-induced fractionation. This effect may be ascribed to changes in the degree of ionization and shielding of the acid groups. Increasing temperature (from 4 °C to 80 °C) also exerted a significant influence on phase separation-induced fractionation. The best temperature for the GA/HA mixture system was 40 °C, while higher temperatures negatively affected the fractionation due to denaturation and possibly degradation in mixed solutions. Increasing the ethanol content up to 30% showed almost no effect on the phase separation-induced fractionation.

    Sensor Fusion and Stroke Learning in Robotic Table Tennis

    Research on robotic table tennis is attractive for studying diverse algorithms in many fields, such as object detection, robot learning, and sensor fusion, as table tennis is full of challenges in terms of speed and spin. In this thesis, we focus on optimal stroke learning with sensor fusion for a KUKA industrial manipulator. Four high-speed cameras and an IMU are used for object pose detection. To learn an optimal stroke for the robot, a novel policy gradient approach is proposed. Firstly, we develop a multi-camera calibration approach for wide-baseline camera pairs. The initial intrinsic and extrinsic transformations are computed using the classic calibration methods, resulting in a 3D position error of 15.0 mm for four cameras (11.0 mm for each stereo pair) in our test dataset. A novel loss function is proposed to post-optimize them with a new set of pattern images from each camera. The final accuracy is 3.2 mm for stereo cameras and 2.5 mm for four cameras. To efficiently use those cameras, we divide them into two stereo-camera pairs for the ball and racket detection, respectively. With the well-calibrated cameras, the 3D position of the ball can be triangulated once the pixel positions of the ball center are determined with two different approaches: color thresholding and a two-layer CNN. Secondly, we propose an optimal stroke learning approach for teaching the robot to play table tennis. A realistic simulation environment is built for the ball’s dynamics and the robot’s kinematics. The learning strategy is decomposed into two stages: the ball hitting state prediction and the optimal stroke learning. Based on the controllable and applicable actions in our robot, a multi-dimensional reward function and Q-value model are proposed. The comparison with other RL methods is performed using an evaluation dataset of 1000 balls in simulation. An efficient retraining approach is proposed to close the sim-to-real gap. The testing experiments in reality show that the robot can successfully return the ball to the desired target with an error of around 24.9 cm and a success rate of 98% in three different scenarios. Instead of training the policy in simulation, another option is initializing it with the actions of a human player and the corresponding state of the ball. To get the human actions, we directly detect the racket from images and estimate its 6D pose using two proposed approaches: traditional image processing with two cameras and deep learning by fusing one camera and an IMU. The experiment shows the latter method outperforms the former in terms of robustness for both the black and red sides of the racket. The former method is 1.9 cm better in position (2.8 cm versus 4.7 cm), but much slower in speed when the detection head is replaced with YOLOv4. Finally, a behavior cloning experiment is performed to reveal the potential of this work.